[SPARK-13610][ML] Create a Transformer to disassemble vectors in Data… #16486
I don't think this is worth adding. It's already pretty easy to pull a single field out of a vector.
It's a counterpart to VectorAssembler that makes it easy for users to work with single fields and vector fields. Our business uses the disassemble transform a lot, and it always has to be handled by writing some code. This Transformer will be easy for users to understand and use, right?
@mengxr, could you help check this patch? Thanks.
@jkbradley, could you also help check this patch, since you are familiar with this defect? Thanks.
I could use this. I have a UDF to pick out the single values I want, but my implementation is slow. Here is my Python UDF:
@mrjrdnthms, this is implemented with a UDF, which runs a little slower but is easy to use.
@leonfl The Python UDF is too slow for my task. By "mapPartitions and row iterator" do you mean doing the transformation on the RDD directly instead of on the DataFrame? Sorry for the basic question. I am new to Spark. And thanks for the help.
@mrjrdnthms, yes, your understanding is correct. In Scala it looks like this:
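For Python users following this exchange, the per-partition idea can be sketched roughly as below. This is only an illustration, not the code from this patch: `disassemble_partition` is a hypothetical helper name, rows are assumed to be `(key, vector)` pairs, and each vector is assumed to already be a plain sequence of numbers (for a Spark ML vector, for example after calling `toArray()`). The sketch runs on an ordinary list so it can be checked without a cluster.

```python
from typing import Iterable, Iterator, Sequence, Tuple


def disassemble_partition(
    rows: Iterable[Tuple[object, Sequence[float]]],
) -> Iterator[tuple]:
    """Flatten (key, vector) rows into (key, v0, v1, ...) rows.

    Hypothetical helper: it consumes one partition's row iterator lazily
    and yields one flattened row per input row.
    """
    for key, vector in rows:
        # Convert every element to a plain float so the output row
        # contains only scalar fields, not a vector.
        yield (key, *(float(x) for x in vector))


# Local check: an ordinary list stands in for a single partition.
partition = [("a", [1.0, 2.0, 3.0]), ("b", [4.0, 5.0, 6.0])]
flattened = list(disassemble_partition(partition))
print(flattened)
```

A function of this shape (iterator in, iterator out) is what PySpark's `RDD.mapPartitions` expects, so under the assumptions above it could be passed as `df.rdd.mapPartitions(disassemble_partition)`. Whether that is actually faster than a UDF for a given job would need to be measured.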
Can one of the admins verify this patch?
Was this ever implemented?
Such a great transformer. I don't understand why they chose to ignore this patch.
It is not possible to retrieve a single element from VectorAssembler. It's only possible to retrieve a subset of the array, and the result is still an array rather than the element itself.
JIRA Issue: https://issues.apache.org/jira/browse/SPARK-13610
What changes were proposed in this pull request?
Add a VectorDisassembler that disassembles a vector field into single fields.
How was this patch tested?
Unit tests have been added to ml for this feature.